Lexical and Textual Resources for Sense Recognition and Description
Abstract
It is common knowledge that the creation of language resources for Language Engineering (LE) applications is a time-consuming, and hence expensive, enterprise. This is the source of the persistent demand for reusable resources. In this paper we concentrate, however, on a complementary aspect: combining and extending existing resources by a variety of means and with a minimum of manual interaction. The resources discussed below are (i) a large lexical database, (ii) a formalized computational lexicon, and (iii) a sense-tagged corpus for Swedish. We also give some results from the semi-automatic annotation of the corpus, together with examples of the phenomena analysed, such as compounding. The annotation was performed within the framework of the SemTag project, and part of this material has been successfully used in the SENSEVAL-2 exercise. To these three resources can be added the background material of the Swedish Language Bank (some hundred million words), which forms the basis for the creation of (i) and, in part, (ii). Because the lexical resources were developed at our department, they can easily be accessed and, more importantly, systematically improved where necessary. It should be noted that this type of work requires close cooperation between specialists in lexicography and language technology.
Similar resources
Key Lexical Chunks in Applied Linguistics Article Abstracts
In any discourse domain, certain chunks are particularly frequent and deserve attention by the novice to be initiated and by the expert to maintain a sense of community. To make a relevant contribution to the awareness about applied linguistics texts and discourse, this study attempted to develop lists of lexical chunks frequently used in the abstracts of applied linguistics journals. The abstr...
WHU at TAC 2009: A Tri-categorization Approach to Textual Entailment Recognition
This paper describes our system of recognizing textual entailment for RTE-5 challenge at TAC 2009. We propose a textual entailment recognition framework and implement a system of classification which takes lexical, syntactic and semantic features as considered. To improve the performance, some lexical-semantic resources and web knowledge bases are also incorporated in the system. Official resul...
Navigating sense-aligned lexical-semantic resources: The web interface to UBY
In this paper, we present the Web interface to UBY, a large-scale lexical resource based on the Lexical Markup Framework (LMF). UBY contains interoperable versions of nine resources in two languages. The interface allows to conveniently examine and navigate the encoded information in UBY across resource boundaries. Its main contributions are twofold: 1) The visual view allows to examine the sen...
Semantic and Logical Inference Model for Textual Entailment
We compare two approaches to the problem of Textual Entailment: SLIM, a compositional approach modeling the task based on identifying relations in the entailment pair, and BoLI, a lexical matching algorithm. SLIM’s framework incorporates a range of resources that solve local entailment problems. A search-based inference procedure unifies these resources, permitting them to interact flexibly. Bo...
Syntactic/Semantic Structures for Textual Entailment Recognition
In this paper, we describe an approach based on off-the-shelf parsers and semantic resources for the Recognizing Textual Entailment (RTE) challenge that can be generally applied to any domain. Syntax is exploited by means of tree kernels whereas lexical semantics is derived from heterogeneous resources, e.g. WordNet or distributional semantics through Wikipedia. The joint syntactic/semantic mod...